27 research outputs found

Parametric inference in the large data limit using maximally informative models

Author: Atwal Gurinder S.
Kinney Justin B.
Publication date: 13/12/2013

Motivated by data-rich experiments in transcriptional regulation and sensory neuroscience, we consider the following general problem in statistical inference. When exposed to a high-dimensional signal S, a system of interest computes a representation R of that signal which is then observed through a noisy measurement M. From a large number of signals and measurements, we wish to infer the "filter" that maps S to R. However, the standard method for solving such problems, likelihood-based inference, requires perfect a priori knowledge of the "noise function" mapping R to M. In practice such noise functions are usually known only approximately, if at all, and using an incorrect noise function will typically bias the inferred filter. Here we show that, in the large data limit, this need for a pre-characterized noise function can be circumvented by searching for filters that instead maximize the mutual information I[M;R] between observed measurements and predicted representations. Moreover, if the correct filter lies within the space of filters being explored, maximizing mutual information becomes equivalent to simultaneously maximizing every dependence measure that satisfies the Data Processing Inequality. It is important to note that maximizing mutual information will typically leave a small number of directions in parameter space unconstrained. We term these directions "diffeomorphic modes" and present an equation that allows these modes to be derived systematically. The presence of diffeomorphic modes reflects a fundamental and nontrivial substructure within parameter space, one that is obscured by standard likelihood-based inference.

Comment: To appear in Neural Computation

arXiv.org e-Print Archive

Cold Spring Harbor Laboratory Institutional Repository

Equitability, mutual information, and the maximal information coefficient

Author: Atwal Gurinder S.
Kinney Justin B.
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 31/01/2013

Reshef et al. recently proposed a new statistical measure, the "maximal information coefficient" (MIC), for quantifying arbitrary dependencies between pairs of stochastic quantities. MIC is based on mutual information, a fundamental quantity in information theory that is widely understood to serve this need. MIC, however, is not an estimate of mutual information. Indeed, it was claimed that MIC possesses a desirable mathematical property called "equitability" that mutual information lacks. This was not proven; instead it was argued solely through the analysis of simulated data. Here we show that this claim, in fact, is incorrect. First we offer mathematical proof that no (non-trivial) dependence measure satisfies the definition of equitability proposed by Reshef et al. We then propose a self-consistent and more general definition of equitability that follows naturally from the Data Processing Inequality. Mutual information satisfies this new definition of equitability while MIC does not. Finally, we show that the simulation evidence offered by Reshef et al. was artifactual. We conclude that estimating mutual information is not only practical for many real-world applications, but also provides a natural solution to the problem of quantifying associations in large data sets.

arXiv.org e-Print Archive

Cold Spring Harbor Laboratory Institutional Repository

PubMed Central

Kerfuffle: a web tool for multi-species gene colocalization analysis

Author: Aboukhalil Robert
Atwal Gurinder S.
Fendler Bernard
Publication date: 01/01/2013

The evolutionary pressures that underlie the large-scale functional organization of the genome are not well understood in eukaryotes. Recent evidence suggests that functionally similar genes may colocalize (cluster) in the eukaryotic genome, suggesting the role of chromatin-level gene regulation in shaping the physical distribution of coordinated genes. However, few of the bioinformatic tools currently available allow for a systematic study of gene colocalization across several, evolutionarily distant species. Kerfuffle is a web tool designed to help discover, visualize, and quantify the physical organization of genomes by identifying significant gene colocalization and conservation across the assembled genomes of available species (currently up to 47, from humans to worms). Kerfuffle only requires the user to specify a list of human genes and the names of other species of interest. Without further input from the user, the software queries the e!Ensembl BioMart server to obtain positional information and discovers homology relations in all genes and species specified. Using this information, Kerfuffle performs a multi-species clustering analysis, presents downloadable lists of clustered genes, performs Monte Carlo statistical significance calculations, estimates how conserved gene clusters are across species, plots histograms and interactive graphs, allows users to save their queries, and generates a downloadable visualization of the clusters using the Circos software. These analyses may be used to further explore the functional roles of gene clusters by interrogating the enriched molecular pathways associated with each cluster.

Comment: BMC Bioinformatics, In press

arXiv.org e-Print Archive

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

PubMed Central

Estimating mutual information and multi-information in large networks

Author: Atwal Gurinder S.
Bialek William
Slonim Noam
Tkacik Gasper
Publication date: 01/01/2005

We address the practical problems of estimating the information relations that characterize large networks. Building on methods developed for analysis of the neural code, we show that reliable estimates of mutual information can be obtained with manageable computational effort. The same methods allow estimation of higher order, multi-information terms. These ideas are illustrated by analyses of gene expression, financial markets, and consumer preferences. In each case, information theoretic measures correlate with independent, intuitive measures of the underlying structures in the system.

arXiv.org e-Print Archive

Cold Spring Harbor Laboratory Institutional Repository

Cell non-autonomous interactions during non-immune stromal progression in the breast tumor microenvironment

Author: Antoniou Eric
Atwal Gurinder S
Bastian Anja
Gao Qing
Gopalakrishna-Pillai Sailesh
Huang Yinghui J
Jin Ying
Lee Peter P
Sadagopan Narayanan
Utama Raditya
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 04/02/2019

Summary: The breast tumor microenvironment of primary and metastatic sites is a complex milieu of differing cell populations, consisting of tumor cells and the surrounding stroma. Despite recent progress in delineating the immune component of the stroma, the genomic expression landscape of the non-immune stroma (NIS) population and their role in mediating cancer progression and informing effective therapies are not well understood. Here we obtained 52 cell-sorted NIS and epithelial tissue samples across 37 patients from i) normal breast, ii) normal breast adjacent to primary tumor, iii) primary tumor, and iv) metastatic tumor sites. Deep RNA-seq revealed diverging gene expression profiles as the NIS evolves from normal to metastatic tumor tissue, with intra-patient normal-primary variation comparable to inter-patient variation. Significant expression changes between normal and adjacent normal tissue support the notion of a cancer field effect, but extended out to the NIS. Most differentially expressed protein-coding genes and lncRNAs were found to be associated with pattern formation, embryogenesis, and the epithelial-mesenchymal transition. We validated the protein expression changes of a novel candidate gene, C2orf88, by immunohistochemistry staining of representative tissues. Significant mutual information between epithelial ligand and NIS receptor gene expression, across primary and metastatic tissue, suggests a unidirectional model of molecular signaling between the two tissues. Furthermore, survival analyses of 827 luminal breast tumor samples demonstrated the predictive power of the NIS gene expression to inform clinical outcomes. Together, these results highlight the evolution of NIS gene expression in breast tumors and suggest novel therapeutic strategies targeting the microenvironment.

Cold Spring Harbor Laboratory Institutional Repository

Absence of central tolerance in Aire-deficient mice synergizes with immune-checkpoint inhibition to enhance antitumor responses.

Author: Atwal Gurinder S
Benitez Asiel A
Gupta Namita T
Haxhinasto Sokol
Khalil-Agüero Sara
Murphy Andrew J
Nandakumar Anjali
Sleeman Matthew A
Zhang Wen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 08/07/2020

The endogenous anti-tumor responses are limited in part by the absence of tumor-reactive T cells, an inevitable consequence of thymic central tolerance mechanisms ensuring prevention of autoimmunity. Here we show that tumor rejection induced by immune checkpoint blockade is significantly enhanced in Aire-deficient mice, the epitome of central tolerance breakdown. The observed synergy in tumor rejection extended to different tumor models, was accompanied by increased numbers of activated T cells expressing high levels of Gzma, Gzmb, Perforin, Cxcr3, and increased intratumoural levels of Cxcl9 and Cxcl10 compared to wild-type mice. Consistent with Aire's central role in T cell repertoire selection, single cell TCR sequencing unveiled expansion of several clones with high tumor reactivity. The data suggest that breakdown in central tolerance synergizes with immune checkpoint blockade in enhancing anti-tumor immunity and may serve as a model to unmask novel anti-tumor therapies including anti-tumor TCRs, normally purged during central tolerance.

Cold Spring Harbor Laboratory Institutional Repository

Dynamic plasticity in coupled avian midbrain maps

Author: A. J. King
A. L. Yuille
Gurinder Singh Atwal
J. H. L. Pick
J. L. Schnapf
M. Rucci
S. F. Edwards
T. M. Cover
W. Bialek
Publication venue: 'American Physical Society (APS)'

Crossref

A framework for highly multiplexed dextramer mapping and prediction of T cell receptor sequences to antigen specificity.

Author: Atwal Gurinder S
Chen Calvin R
Choonoo Gabrielle
Deering Raquel
Dhanik Ankur
Dillon Myles
Gupta Namita T
Hawkins Peter G
He Jing
Jeong Se W
Liu Jinrui
Macdonald Lynn E
Thurston Gavin
Zhang Wen
Publication venue: 'American Association for the Advancement of Science (AAAS)'
Publication date: 01/05/2021

T cell receptor (TCR) antigen-specific recognition is essential for the adaptive immune system. However, building a TCR-antigen interaction map has been challenging due to the staggering diversity of TCRs and antigens. Accordingly, highly multiplexed dextramer-TCR binding assays have been recently developed, but the utility of the ensuing large datasets is limited by the lack of robust computational methods for normalization and interpretation. Here, we present a computational framework comprising a novel method, ICON (Integrative COntext-specific Normalization), for identifying reliable TCR-pMHC (peptide-major histocompatibility complex) interactions and a neural network-based classifier TCRAI that outperforms other state-of-the-art methods for TCR-antigen specificity prediction. We further demonstrated that by combining ICON and TCRAI, we are able to discover novel subgroups of TCRs that bind to a given pMHC via different mechanisms. Our framework facilitates the identification and understanding of TCR-antigen-specific interactions for basic immunological research and clinical immune monitoring.

Cold Spring Harbor Laboratory Institutional Repository

Fine-scale detection of population-specific linkage disequilibrium using haplotype entropy in the human genome

Author: A Carvajal-Rodríguez
AF Reis
Alexei Vazquez
Arnold J Levine
BF Voight
BL Niell
D Gezen-Ak
DC Crawford
DE Reich
EJ Parra
G Ménasché
G Ribas
GS Atwal
Gurinder Atwal
Haijian Wang
Hideaki Mizuno
International HapMap Consortium
J Costas
J McGrath
J Zhang
JD Simmons
JK Pickrell
JM Valdivielso
KM Teshima
LE Matesic
M Kuningas
M Nothnagel
MV Rockman
NG Jablonski
PC Sabeti
R Bouillon
R Nielsen
S Myles
SA Tishkoff
SH Williamson
T Nakajima
WP Walker
Y Picornell
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The creation of a coherent genomic map of recent selection is one of the greatest challenges towards a better understanding of human evolution and the identification of functional genetic variants. Several methods have been proposed to detect linkage disequilibrium (LD), which is indicative of natural selection, from genome-wide profiles of common genetic variations but are designed for large regions. Results: To find population-specific LD within small regions, we have devised an entropy-based method that utilizes differences in haplotype frequency between populations. The method has the advantages of incorporating multilocus association, conciliation with low allele frequencies, and independence from allele polarity, which are ideal for short haplotype analysis. The comparison of HapMap SNPs data from African and Caucasian populations with a median resolution size of ~23 kb gave us novel candidates as well as known selection targets. Enrichment analysis for the yielded genes showed associations with diverse diseases such as cardiovascular, immunological, neurological, and skeletal and muscular diseases. A possible scenario for a selective force is discussed. In addition, we have developed a web interface (ENIGMA, available at http://gibk21.bse.kyutech.ac.jp/ENIGMA/index.html), which allows researchers to query their regions of interest for population-specific LD. Conclusion: The haplotype entropy method is powerful for detecting population-specific LD embedded in short regions and should contribute to further studies aiming to decipher the evolutionary histories of modern humans.

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central